High-performing and dramatically improving schools are led by 
strong principals. The Quality School Leadership (QSL) services 
at Learning Point Associates give educators the tools they need 
to hire and assess their leaders. Our Quality School Leadership 
Identification (QSL-ID) process is a standardized hiring procedure 
built from research-based tools that local hiring committees can 
use to reach consensus when selecting a new school principal. 

The QSL-ID process guides schools and districts through each 
of the specific steps to hiring the right school leader and allows 
them to: 

• Establish an effective hiring committee that understands the 
specific leadership needs of the school or district. 

• Recruit principal candidates based on the criteria that best 
meet school and district goals. 

• Identify the strongest candidates and conduct an onsite 
performance assessment of finalists. 

• Plan for a smooth leadership transition. 

Learn more about our QSL-ID services at 
http://www.learningpt.org/expertise/educatorquality/schoolLeadershipIdentification.php/. 
Introduction 



Assessing school principal performance is both necessary and challenging. It is necessary 
because principal performance assessments offer districts an additional mechanism to ensure 
accountability for results and reinforce the importance of strong leadership practices. After all, 
school principals are second only to classroom teachers as the most influential school factor in 
student achievement (Hallinger & Heck, 1998; Leithwood, Louis, Anderson, & Wahlstrom, 
2004). Principal performance assessments also give central office administrators and principals 
themselves information with which to build professional learning plans and chart professional 
growth. Such assessments are challenging because principals’ practice and influence on 
instruction are sometimes not readily apparent. 



During the past five years, many states have begun using validated measures in summative 
assessments of novice principal competency as a basis for certification decisions. These measures 
may be psychometrically sound but often cannot be used for formative performance assessments 
or professional development planning (Reeves, 2005). To be used as a formative performance 
assessment, test results would have to be disaggregated, and their underlying constructs would 
need to be made transparent to readers. In addition, administrative and analytic control would 
have to be transferred to local educators (see “Formative Versus Summative Assessment: 
What Is the Difference?”). 



Although standardized tests tend to serve 
summative purposes such as certification, 
school districts are using other types of 
assessments formatively to judge principal 
performance and plan professional learning. 
Scanning the field, Goldring et al. (2009) found 
that school districts often use idiosyncratic and 
inconsistent measures for principal performance 
assessment. Districts’ principal performance 
assessments may or may not be aligned with 
existing professional standards, and they 
often lack justification or documentation of 
psychometric rigor (Heck & Marcoulides, 1996). 
In other words, district performance assessments 
allow for formative feedback, but the measures 
vary in quality and rigor. This variance opens 
up the possibility that scores are inaccurate or 
performance assessments do not reflect research- 
based standards of the field. 




Formative Versus Summative 
Assessment: What Is the Difference? 

No matter their form, assessments 
generally have two purposes. An 
assessment used for summative purposes 
tends to inform a decision about the test 
taker’s competence, and there is no 
opportunity for remediation or development 
after completion. An assessment used for 
formative purposes is also a measure of 
competence, but results are used to inform 
future actions. For example, a formative 
purpose of performance assessment 
is to inform a principal’s professional 
development plan. A single assessment 
may serve formative and summative 
purposes in different situations. 

The Learning Point Associates scan included 
only publicly available and rigorously tested 
measures that are useful for formative 
assessment purposes. 
Superintendents and others who seek to improve 
principal performance assessment may select 
one or more of these measures or may develop 
and validate their own measures. Regardless 
of origin, assessments should be validated and 
reliable to ensure accuracy and applicability 
to principal performance. 

This brief reports results of a scan of 
publicly available measures conducted by 
Learning Point Associates staff. The measures 
included in this review are expressly intended 
to evaluate principal performance and have 
varying degrees of publicly available evidence 
of psychometric testing. The review of this 
information is intended to inform decision 
makers’ selection of job performance 
instruments used for hiring, performance 
assessment, and tenure decisions. This brief also 
addresses the importance of standards-based 
measures, the need for establishing reliability 
and validity, and the measures that are more 
widely accepted and psychometrically sound. 

New Standards for 
Principal Performance 

Knowledge about what strong principals do 
to develop and maintain teaching and learning 
excellence has evolved with the changes in the 
context of schooling and improved school 
leadership research. School principals are 
being asked to ensure that all students have 
access to high-quality instruction and all 
educators are held accountable for student 
learning. These tasks deepen and broaden 
principals’ professional responsibilities beyond 
their traditional roles as building managers. 

New standards for principal performance 
have emerged and reflect new emphases in 
the profession. The Educational Leadership 
Policy Standards: ISLLC 2008, for example, 
are a widely recognized and referenced 
principals standards list (Council of Chief 
State School Officers, 2008). The ISLLC 
Standards contain six domains for principal 
professional practice: 

• Setting a widely shared vision for learning 

• Developing a school culture and 
instructional program conducive to student 
learning and staff professional growth 

• Ensuring effective management of the 
organization, operation, and resources 
for a safe, efficient, and effective 
learning environment 

• Collaborating with faculty and 
community members, responding to 
diverse community interests and needs, 
and mobilizing community resources 
• Acting with integrity, fairness, and in 
an ethical manner 

• Understanding, responding to, and 
influencing the political, social, legal, 
and cultural context 

As the ISLLC Standards suggest, principals 
must work within a well-formed ethical code 
to oversee instructional quality; develop 
teacher talents; establish a learning culture 
in schools; and work within and beyond the 
school to secure financial, human, and 
political capital to maintain and advance 
organizational operations. 

The ISLLC Standards have been integrated 
into many states’ licensure procedures 
through the following means: 

• Alignment of ISLLC Standards with state 
principal professional standards 

• Requirement of all principal candidates to 
receive a certain score on a standardized 
examination, which has been validated 
against ISLLC Standards, as a prerequisite 
for certification 

• Requirement of state-recognized preservice 
principal preparation programs to display 
and defend how program activities prepare 
and determine whether candidates meet 
ISLLC Standards 

Less is known about the integration 
and alignment of ISLLC Standards, other 
standards lists, or other promising leadership 
practices with principal performance 
assessments. 

Reliability and Validity 

To be included in the scan, a measure had 
to have published documentation of validity 
and reliability testing. Such testing provides evidence 
of psychometric rigor, which should be 
considered by purchasers and users of 
performance assessments. 



Assessments are considered valid when they 
measure what they are intended to measure. 
There are many types of validity, but two 
of the more salient types in constructing 
performance measures are content and 
construct validity. Content validity is 
established by ensuring that the test items 
under consideration measure all of the 
dimensions or facets of a given construct, 
such as principal performance. Content 
validity can be established by linking the test 
or other items to a set of standards, such as 
the ISLLC Standards, or practices, such as 
leadership effectiveness. 
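
The linking step described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from survey items to the six ISLLC domains (the item IDs and assignments below are invented for illustration) and simply reports which domains no item addresses. A coverage check of this kind is only one piece of evidence for content validity, not a substitute for expert review.

```python
# Hypothetical example: item IDs and their domain assignments are illustrative only.
ISLLC_DOMAINS = {1, 2, 3, 4, 5, 6}


def uncovered_domains(item_to_domain, domains=ISLLC_DOMAINS):
    """Return, in sorted order, the domains that no item is linked to."""
    covered = set(item_to_domain.values())
    return sorted(domains - covered)


# Five invented items; none is linked to domain 4 or domain 6.
example_items = {"q1": 1, "q2": 2, "q3": 2, "q4": 3, "q5": 5}
print(uncovered_domains(example_items))  # [4, 6]
```

An empty result would indicate that every domain is addressed by at least one item.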

Construct validity is determined by the degree 
to which test items measure a “construct,” 
which is the element that the items purport 
to assess. For example, a construct may be 
ISLLC Standard 5, “An education leader 
promotes the success of every student by 
acting with integrity, fairness, and in an 
ethical manner” (Council of Chief State 
School Officers, 2008, p. 15). For this 
construct, multiple test items or another 
method for collecting evidence would be 
needed to determine the degree to which 
the standard is met. In this case, testing for 
construct validity would determine how well 
items and observations measure principals’ 
abilities to act with integrity, fairness, and 
in an ethical manner. 

Reliability is a measure of consistency and 
stability. A measure has reliability when the 
responses are consistent and stable for each 
individual who takes the test. In other words, 
a principal should receive roughly the same 
score on multiple administrations of a given 
test if all factors remain the same. 
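
One common way to quantify this kind of consistency is a test-retest correlation: the Pearson correlation between scores from two administrations of the same measure. The sketch below is illustrative only. The scores are invented, and the brief does not state which reliability statistic each reviewed instrument reports.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("scores must vary within each administration")
    return cov / sqrt(var_1 * var_2)


# Invented scores for five principals on two administrations of one test.
administration_1 = [72, 85, 90, 64, 78]
administration_2 = [70, 88, 91, 66, 75]
print(round(pearson_r(administration_1, administration_2), 2))
```

A coefficient near 1 indicates that principals are ranked and scored similarly on both occasions.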
The Reviewed Measures 

Of the 20 school principal performance 
assessment measures identified through 
Google Scholar, eight met preestablished 
criteria for inclusion in the review (see “How 
Assessments Were Selected for Review”). 

Some measures, such as the ETS School 
Leadership Series examinations, provided 
extensive documentation of reliability and 
validity testing but no information about 
the formative use of results in performance 
assessment, so they were not included 
in the review. Other measures, such as 
the Chicago Public Schools’ principal 
performance rubric, are clearly intended 
for use during performance assessments, but 
no documentation was available about the 
validity or reliability of these measures. 

How Assessments Were Selected 
for Review 

Learning Point Associates staff conducted a 
keyword search of Google Scholar to locate 
school principal performance assessment 
instruments. More than 5,000 articles were 
initially identified, but the majority of articles 
were not pertinent. To winnow the list further, 
publicly available performance assessment 
support documents had to report that the 
assessment was 

• Intended for use as a performance 
assessment. 

• Psychometrically tested for reliability 
and validity. 

• Publicly available for purchase. 

For the purposes of the review, psychometrically 
sound means that the instrument must be 
tested for validity and reliability using 
accepted testing measures. A minimum 
reliability rating of 0.75 must be achieved. 
Also, content validity and/or construct 
validity testing must have occurred. 

Using these criteria, 20 assessments were 
identified, and eight principal performance 
assessment instruments were included in 
the final review. 

The following principal performance 
assessments were included in the review and 
may be useful resources for superintendents, 
human resource directors, and others who 
are charged with gauging principal skills and 
abilities for hiring, performance assessment, 
and tenure decisions. Table 1 provides 
additional information about each of the 
measures included in this review (see p. 7). 

Change Facilitator 
Style Questionnaire 

Vandenberghe (1988) developed the Change 
Facilitator Style Questionnaire (CFSQ) to 
measure the extent to which leaders can 
facilitate change (see School Administrators 
of Iowa, 2003). In CFSQ, three different 
approaches have been identified as change 
facilitator styles: initiator, manager, and 
responder. Data are categorized into 
three clusters with two scales/dimensions 
embedded within each cluster: 

• Cluster 1. Concern for People: Scale 1 
(Social/Informal) and Scale 2 
(Formal/Meaningful) 

• Cluster 2. Organizational Efficiency: 
Scale 3 (Trust in Others) and 
Scale 4 (Administrative Efficiency) 

• Cluster 3. Strategic Sense: Scale 5 
(Day-to-Day) and Scale 6 (Vision and Planning) 
Diagnostic Assessment of School 
and Principal Effectiveness 

Ebmeier (1992) developed this measure to 
identify the strengths of schools and their 
leaders so that school improvement plans 
and principal professional development goals 
would be better informed. To complete the 
assessment, separate surveys are completed 
by students, teachers, parents, principals, and 
principal supervisors. The measures indicate 
how these groups view themselves, school 
leadership, and school performance. Multiple 
measures are completed by multiple groups 
to identify matches between school leader 
traits and school characteristics. These 
measures can be used separately depending 
on their purpose. For more information, 
see Ebmeier (1991). 

Instructional Activity Questionnaire 

This measure was developed by Larsen (1987) 
as a performance assessment tool that 
specifically addresses instructional leadership 
aspects of principals’ work (as cited in Heck, 
Larsen, & Marcoulides, 1990). The measure 
was developed through an extensive review of 
the school principal effectiveness literature. 

Leadership Practices Inventory 

Kouzes and Posner (2002) developed the 
Leadership Practices Inventory (LPI) by 
extensively interviewing and surveying leaders, 
including principals, to identify best leadership 
practices. Thus, LPI views leadership practices 
as transferrable across professional types. What 
works to inspire people in business settings 
also may work in educational settings. LPI’s 
domains are as follows: (1) modeling the way, 
(2) inspiring a shared vision, (3) challenging 
the process, (4) enabling others to act, and 
(5) encouraging the heart. This measure 
has found widespread appeal across many 
disciplines, and LPI can be completed as an 
online or print survey. For more information, 
see Kouzes and Posner (n.d.). 

Performance Review Analysis and 
Improvement System for Education 

The Performance Review Analysis and 
Improvement System for Education (PRAISE) 
assessment system was developed through an 
extensive review of school administrator 
effectiveness literature. As such, PRAISE 
domains are not specifically aligned with 
professional standards. The PRAISE 
domains are problem solving, relations 
with teachers, and professional qualities and 
competencies. PRAISE is a print assessment 
to be completed by the principal and his or 
her supervisor. For more information, see 
Knoop and Common (1985). 

Principal Instructional 
Management Rating Scale 

Hallinger and Murphy (1985) developed the 
Principal Instructional Management Rating 
Scale (PIMRS) to determine the degree to 
which principals serve as instructional 
managers. PIMRS also provides exemplars 
of each construct, which may be used by 
raters to identify changes in their own or 
others’ practices. PIMRS focuses on several 
constructs, including the dedicated use of 
time for improving instruction, coordinating 
curriculum, and evaluating instruction. For 
more information, see Leadingware (2008). 



MEASURING PRINCIPAL PERFORMANCE 






Principal Profile 

The Principal Profile was developed through 
extensive interview and consultation with 
principals, teachers, superintendents, and 
department heads. The authors consulted 
with practitioners to establish validity and 
reliability but also to ensure that the measure 
was practical for use in school/district settings. 
Two key assumptions inform the tool: 
(1) student growth should be a benchmark 
for school leader effectiveness and a factor in 
performance evaluation and (2) school leader 
effectiveness is marked by consistency of 
actions, in that principals need a well-defined 
set of purposes and the skill and knowledge 
to achieve them on a consistent basis. For 
more information, see Leithwood and 
Montgomery (1986) and Leithwood (1987). 




Vanderbilt Assessment of 
Leadership in Education 

Since the Vanderbilt Assessment of Leadership 
in Education (VAL-ED) was developed in 2006, 
it has become one of the most widely used 
and respected measures of school leadership 
performance assessment. Like the Diagnostic 
Assessment of School and Principal Effectiveness, 
VAL-ED assesses principal performance by 
gathering information from principals, teachers, 
and principal supervisors. The results from 
VAL-ED produce a quantitative diagnostic 
profile that is linked to the ISLLC standards. 
VAL-ED is based on a thorough examination 
of the research literature and on a conceptual 
framework within which to place the scale. 
For more information, see Vanderbilt Peabody 
College (n.d.) and Porter, Murphy, Goldring, 
and Elliot (2006). 

Summary of Findings 

Table 1 synthesizes findings from the review 
of instruments. In the table, the content focus 
of the assessment (e.g., principal as change 
facilitator or principal as instructional 
leader) and evaluation approach (e.g., self- 
reflection survey or 360-degree evaluation) are 
indicated in the column labeled “Approach.” 
Validity measures and testing methods are 
generally described. In the “Reliability” 
column, a benchmark of 0.80 was used to 
indicate “moderate” reliability, and a 
benchmark of 0.90 was used to indicate 
“high” reliability. Any reliability rating 
below 0.80 was considered “poor.” 
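
The benchmarks above amount to a simple classification rule, sketched here for clarity. The function name is hypothetical, and the treatment of values that fall exactly on a benchmark (0.80 counted as “moderate,” 0.90 as “high”) is an assumption, because the text does not say how boundary values were handled.

```python
def reliability_label(coefficient):
    """Map a reliability coefficient to the label used in the Reliability column.

    Assumption: a value exactly equal to a benchmark receives the higher label.
    """
    if coefficient >= 0.90:
        return "high"
    if coefficient >= 0.80:
        return "moderate"
    return "poor"


for value in (0.93, 0.84, 0.76):
    print(value, reliability_label(value))
```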







[Table 1: summary of the reviewed measures, with columns including “Approach” and “Reliability.” The table body is not legible in this copy. The only recoverable entry, for the Principal Instructional Management Rating Scale (PIMRS), reads: “Widely used in the field” and “[...] correlations among items within a subscale than for the same items for other subscales. In addition, PIMRS scores are corroborated by school documents.”] 
Findings 

The Internet-based scan of scholarly articles 
and books identified 20 school 
principal performance assessments, which 
were intended for use in hiring, advancement, 
and tenure decisions. Of the 20 assessments, 
eight met criteria for rigor, which meant that 
the assessment development process was 
transparent and involved some psychometric 
testing, and measures were provided for review. 
Two of the eight assessments were developed 
in the past decade, and the remainder were 
developed 10-20 years ago. 

The scan suggests that, although there is 
considerable interest in school principal 
quality and accountability, few principal 
performance assessments have been rigorously 
developed or make details of psychometric 
testing available for public review. One 
possible explanation for the finding is that few 
assessments are being used in the field, but 
the findings of Goldring et al. (2009) suggest 
that many principal performance assessments 
of varying quality are being used. Unpublished 
assessments were not included in the scan. 

In addition, the age of instruments raises 
questions about their continued validity 
for assessing principal performance. Given 
the emphasis on instructional leadership, 
accountability, data-based decision making, 
community involvement, and other well- 
documented changes to the school principal 
position in the past 10 years, it is plausible that 
older measures do not capture essential features 
of the position. Changes in the position and 
additional research on principal effectiveness 
raise concerns and may be cause for 
revalidation of older assessments. 

The scan also highlights the different 
approaches to assessing school principal 
performance. The eight principal performance 
assessments measure the degree to which 
principals fulfill different roles. For 
example, CFSQ addresses principals’ 
roles as change facilitators, VAL-ED 
focuses on principals as instructional 
leaders, and PRAISE examines principal 
capacity to improve school-level systems. 
Each provides test takers and principal 
evaluators with slightly different perspectives 
on principal practices. 

In addition, the assessments take different 
approaches to data collection. Several 
measures use self-assessment questionnaires 
or rubrics that provide an aggregate score 
and help principals to answer the following 
question: “How do I think I am doing, in 
reference to professional competencies?” 
Others use more intensive 360-degree 
surveys from multiple constituents to 
create an aggregate profile, which can 
provide comparative information based 
on multiple perspectives to principals about 
their performance. The use of different 
constituencies to rate principal performance 
is a growing trend (Lashway, 2003). These 
evaluations answer the following question: 
“How do I, and others, believe I am doing, 
in reference to professional competencies?” 
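
As a rough illustration of how multirater responses might be combined into an aggregate profile, the sketch below averages ratings by domain and rater group. The rater groups, the domain name, the rating scale, and the use of a simple mean are all assumptions made for illustration. The reviewed instruments define their own scoring procedures.

```python
from collections import defaultdict
from statistics import mean


def aggregate_profile(ratings):
    """Average scores per domain and rater group.

    ratings: iterable of (rater_group, domain, score) tuples.
    Returns {domain: {rater_group: mean score}}.
    """
    grouped = defaultdict(list)
    for rater_group, domain, score in ratings:
        grouped[(domain, rater_group)].append(score)

    profile = {}
    for (domain, rater_group), scores in grouped.items():
        profile.setdefault(domain, {})[rater_group] = mean(scores)
    return profile


# Invented ratings for a single hypothetical domain.
example_ratings = [
    ("self", "shared vision", 4),
    ("teacher", "shared vision", 3),
    ("teacher", "shared vision", 4),
    ("supervisor", "shared vision", 3),
]
print(aggregate_profile(example_ratings))
```

Placing the self-rating next to the teacher and supervisor averages is what lets a principal compare “How do I think I am doing?” with how others see the same practice.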

In conjunction with student achievement 
data, the performance assessments that 
are included in this review hold potential 
for raising principal accountability and 
identifying necessary changes in practice. 



However, principal performance assessment 
data will achieve desired ends only if 
principals and their supervisors view the data 
as credible and actionable and give assessment 
data considerable weight during principal 
performance evaluations. Close examinations 
of the principal performance evaluation 
process — its frequency and structure — would 
provide information about how assessments 
are used. In addition, this process would 
offer insight for assessment developers 
about how to structure assessment 
processes for better effects. 